Conformal Prediction Wrappers #184

Open · wants to merge 21 commits into main

Conversation

soulios-basf (Collaborator):

Overview:
This PR introduces several new files to support conformal prediction in molecular machine learning pipelines, as well as comprehensive unit tests for pipeline and conformal prediction functionality.

conformal.py: Implements UnifiedConformalCV (single-model) and CrossConformalCV (aggregate CP) wrappers for easy conformal prediction (classification/regression) on top of scikit-learn models; see the usage sketch below.
test_conformal.py: Unit tests for the conformal wrappers, covering both regression and classification.
test_pipeline.py: Unit tests for the main pipeline, including integration with conformal prediction.
advanced_04_conformal_prediction.ipynb: Example notebook showing conformal prediction on molecular data, with benchmarking and visualization.
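
A minimal usage sketch for the classification case (the import path and constructor keyword names below are assumptions for illustration; the actual API is defined in conformal.py):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Import path is an assumption; UnifiedConformalCV is defined in the new conformal.py.
    from molpipeline.experimental.conformal import UnifiedConformalCV

    X, y = make_classification(n_samples=500, n_features=16, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Wrap any scikit-learn estimator; estimator_type is either "classifier" or "regressor".
    conformal_clf = UnifiedConformalCV(
        estimator=RandomForestClassifier(random_state=0),
        estimator_type="classifier",
    )
    conformal_clf.fit(X_train, y_train)
    point_predictions = conformal_clf.predict(X_test)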

How to test:
Run pytest on the new test files to verify correctness.
Open and run all cells in the notebook to see conformal prediction in action.

@JochenSiegWork (Collaborator) left a comment:

This is the first part of my review. I mainly looked at conformal.py and only briefly at the tests.

@JochenSiegWork (Collaborator) left a comment:

Now I reviewed conformal.py and the tests. In the next iteration I'll review the notebook.

The tests are well-written :) I only have a few comments and requests.

JochenSiegWork (Collaborator):

If you want to add these to the .gitignore globally, you shouldn't do it in this PR but in a separate one.

Self.

Raises
------
ValueError
    If estimator_type is not 'classifier' or 'regressor'.
    For invalid types and uninitialized.
JochenSiegWork (Collaborator):

Is a word missing in this sentence?


def predict(self, x: NDArray[Any]) -> NDArray[Any]:
    if self.estimator_type not in {"classifier", "regressor"}:
JochenSiegWork (Collaborator):

It shouldn't be necessary to check the content of self.estimator_type again, since you already validated it in __init__.
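
Sketch of the suggested pattern, i.e. validate once in __init__ and let predict rely on that invariant (parameter names are illustrative):

    def __init__(self, estimator: BaseEstimator, estimator_type: str) -> None:
        # Validate once here; predict and other methods no longer need to re-check.
        if estimator_type not in {"classifier", "regressor"}:
            raise ValueError(
                f"estimator_type must be 'classifier' or 'regressor', got {estimator_type!r}"
            )
        self.estimator = estimator
        self.estimator_type = estimator_type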

"""
if not self.fitted_ or self._conformal is None:
raise ValueError("Estimator must be fitted before calling predict")
JochenSiegWork (Collaborator):

Just to make sure: it's not necessary to calibrate before predicting?

n_bins = self.binning

def bin_func(
    x_test: Any,
JochenSiegWork (Collaborator):

There are more specific type hints possible here. Maybe like this?

            x_test: npt.NDArray[Any],
            model: BaseEstimator = model,
            y_min: float = y_min,
            y_max: float = y_max,
            n_bins: int = n_bins,

self.assertIsInstance(pred_set, list)
for class_idx in pred_set:
    self.assertIsInstance(class_idx, (int, np.integer))
    self.assertGreaterEqual(class_idx, 0)
JochenSiegWork (Collaborator):

Maybe also check class_idx < 2?
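
For example (assuming the test data here has two classes):

    self.assertLess(class_idx, 2)  # valid class indices for a binary problem are 0 and 1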

self.assertIsInstance(class_idx, (int, np.integer))
self.assertGreaterEqual(class_idx, 0)

self.assertTrue(np.all(p_values_custom >= 0))
JochenSiegWork (Collaborator):

These bounds should also be checked in the tests that generate p-values.
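
For example (sketch; p_values stands for whatever p-value array the respective test produces):

    # Conformal p-values must lie in [0, 1].
    self.assertTrue(np.all((p_values >= 0.0) & (p_values <= 1.0)))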

# Mondrian should give different results than baseline
self.assertFalse(np.array_equal(intervals_mondrian, intervals_baseline))

def test_cross_conformal_mondrian_both_classes(self) -> None:
JochenSiegWork (Collaborator):

I don't really understand what this test is testing. The test is called "both_classes"; what does that mean, and why is regression then also checked?

)


class ConformalPredictor(BaseEstimator): # pylint: disable=too-many-instance-attributes
JochenSiegWork (Collaborator):

We can also discuss this later with Christian:

In sklearn you usually have separate classes for regression and classification, e.g. RandomForest{Regressor,Classifier}. Since quite a few arguments and functions in the conformal classes work for one of those but not for the other, it would actually make sense to have something like ConformalRegressor, ConformalClassifier, CrossConformalRegressor and CrossConformalClassifier. This would reduce the if/else checking on estimator_type. But this is a more general interface decision, so we can discuss it with Christian later. Since we are planning to merge the conformal code into experimental, it's also fine to change the interface later.
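
A rough sketch of that split (all names and signatures below are hypothetical, just to show how the estimator_type branching would disappear):

    from typing import Any

    import numpy.typing as npt
    from sklearn.base import BaseEstimator


    class ConformalClassifier(BaseEstimator):
        """Classification-only wrapper: exposes p-values / prediction sets."""

        def __init__(self, estimator: BaseEstimator) -> None:
            self.estimator = estimator

        def fit(self, x: npt.NDArray[Any], y: npt.NDArray[Any]) -> "ConformalClassifier":
            self.estimator.fit(x, y)  # calibration omitted in this sketch
            return self

        def predict_set(self, x: npt.NDArray[Any], significance: float = 0.1) -> list[list[int]]:
            ...  # classification-specific output, no estimator_type check needed


    class ConformalRegressor(BaseEstimator):
        """Regression-only wrapper: exposes prediction intervals."""

        def __init__(self, estimator: BaseEstimator) -> None:
            self.estimator = estimator

        def fit(self, x: npt.NDArray[Any], y: npt.NDArray[Any]) -> "ConformalRegressor":
            self.estimator.fit(x, y)  # calibration omitted in this sketch
            return self

        def predict_int(self, x: npt.NDArray[Any], significance: float = 0.1) -> npt.NDArray[Any]:
            ...  # regression-specific output, no estimator_type check needed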

@JochenSiegWork (Collaborator) left a comment:

The tests already look quite good. I think there are three more general tests missing that are necessary to ensure interoperability with the general MolPipeline workflows:

  1. Make a test where a Pipeline is wrapped by the conformal classes. Here is an example with the CalibratedClassifierCV
    def test_calibrated_classifier(self) -> None:
  2. Make a test that uses the conformal classes as part of a pipeline, as done here: https://github.com/basf/neural-fingerprint-uncertainty/blob/69c4dde201d43c7d5242805e3fcb47af54d0b101/scripts/03_ml_experiments.py#L141
  3. Make a test to check that serialization and deserialization work when you dump the model and read it back in again. This should be done in all three variations: with just an sklearn estimator and with the Pipeline variations from 1 & 2.
  • You can use the recursive_to_json and recursive_from_json functions to test JSON serialization, like here:
    json_str = recursive_to_json(m_pipeline)
  • Analogously, you should write a test where you dump the trained conformal classes with joblib, read them back in again, and check that they are equal (see the sketch after this comment).

These three things are important for using the two conformal classes in our downstream workflows. We can also have a call to discuss the details.
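
For point 3, a possible joblib round-trip sketch (constructor keywords reuse the assumptions from the usage sketch in the PR description; comparing predictions before and after reloading is a practical equality check):

    import tempfile
    from pathlib import Path

    import joblib
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # UnifiedConformalCV imported as in the usage sketch above (import path is an assumption).

    def test_joblib_roundtrip(self) -> None:
        """Dump a fitted conformal wrapper with joblib and reload it."""
        x, y = make_classification(n_samples=200, n_features=8, random_state=0)
        model = UnifiedConformalCV(
            estimator=RandomForestClassifier(random_state=0),
            estimator_type="classifier",
        )
        model.fit(x, y)

        with tempfile.TemporaryDirectory() as tmp_dir:
            model_path = Path(tmp_dir) / "conformal_model.joblib"
            joblib.dump(model, model_path)
            reloaded = joblib.load(model_path)

        # Identical predictions before and after the round trip.
        np.testing.assert_array_equal(model.predict(x), reloaded.predict(x))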
